{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# `clean_date()`: Clean and validate date strings"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introduction\n",
    "\n",
    "The function `clean_date()` cleans a column containing date strings, and standardizes them in a desired format. The function `validate_date()` validates either a single date string or a column of date strings, returning \"cleaned\" standing for at the first stage the value is valid, and \"unknown\" otherwise. Note that the first stage means the initial format is correct. However, if the scenario like minite equals to 70 occurs, the function cannot immediately recognize this kind of error at first stage. They will be recognized during running of `clean_date()`. Also, this kind of error will not be cleaned by our function. \n",
    "\n",
    "Currently, many flexible date format like the following format are supported as valid input:\n",
    "\n",
    "* `1996.07.10 AD at 15:08:56 PDT`\n",
    "* `Tuesday, April 12, 1952 AD 3:30:42pm PST`\n",
    "* `2003 Sep 25`\t\n",
    "* `12:00am`\n",
    "* `Thu Sep 25 10:36:28 2003`\n",
    "\n",
    "Various delimiters between the digits are also allowed: \n",
    "`[\" \", \".\", \",\", \";\", \"-\", \"/\", \"'\", \"st\", \"nd\", \"rd\", \"th\", \"at\", \"on\", \"and\", \"ad\", \"AD\", \"of\"]`\n",
    "\n",
    "Phone numbers can be converted to the following formats via the `target_format` parameter. Also, users can specify many flexible target format like these:\n",
    "\n",
    "* `YYYY-MM-DD`\n",
    "* `yyyy.MM.dd AD at HH:mm:ss Z`\n",
    "* `EEE, d MMM yyyy HH:mm:ss Z`\n",
    "\n",
    "Users also can specify `origin_timezone` and `target_timezone` like `PDT`,`GMT` etc. When formatting the date, timezone will be transferred from origin timezone to target timezone.\n",
    "\n",
    "Invalid parsing is handled with the `fix_empty` parameter:\n",
    "\n",
    "* `auto_minimum` (default):\n",
    "    * For hours, minutes and seconds, just fill them with zeros\n",
    "    * For years, months and days, fill it with the minimum value\n",
    "* `empty`: just left the missing component as it is\n",
    "* `auto_nearest`:\n",
    "    * For hours, minutes and seconds, fill it with the nearest value\n",
    "    * For years, months and days, fill it with the nearest value\n",
    "\n",
    "After cleaning, a **report** is printed that provides the following information:\n",
    "\n",
    "* How many values were cleaned (the value must be transformed)\n",
    "* How many values could not be cleaned\n",
    "* And the data summary: how many values are in the correct format, and how many values are null\n",
    "\n",
    "The following sections demonstrate the functionality of `clean_date()` and `validate_date()`. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### An example dirty dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date\n",
       "0              1996.07.10 AD at 15:08:56 PDT\n",
       "1                   Thu Sep 25 10:36:28 2003\n",
       "2              Thu Sep 25 10:36:28 BRST 2003\n",
       "3              2003 10:36:28 BRST 25 Sep Thu\n",
       "4                   Thu Sep 25 10:36:28 2003\n",
       "5                               Thu 10:36:28\n",
       "6                                  Thu 10:36\n",
       "7                                      10:36\n",
       "8                            Thu Sep 25 2003\n",
       "9                                Sep 25 2003\n",
       "10                                  Sep 2003\n",
       "11                                       Sep\n",
       "12                                      2003\n",
       "13                                2003-09-25\n",
       "14                               2003-Sep-25\n",
       "15                               25-Sep-2003\n",
       "16                               Sep-25-2003\n",
       "17                                09-25-2003\n",
       "18                                10-09-2003\n",
       "19                                  10-09-03\n",
       "20                               2003.Sep.25\n",
       "21                                2003/09/25\n",
       "22                               2003 Sep 25\n",
       "23                                2003 09 25\n",
       "24                                      10pm\n",
       "25                                   12:00am\n",
       "26                                    Sep 03\n",
       "27                                 Sep of 03\n",
       "28                          Wed, July 10, 96\n",
       "29             1996.07.10 AD at 15:08:56 PDT\n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST\n",
       "31          November 5, 1994, 8:15:30 am EST\n",
       "32                           3rd of May 2001\n",
       "33                  5:50 AM on June 13, 1990\n",
       "34                                      NULL\n",
       "35                                       nan\n",
       "36                          I'm a little cat\n",
       "37                              This is Sep."
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "df = pd.DataFrame({\"date\":\n",
    "                   ['1996.07.10 AD at 15:08:56 PDT',\n",
    "                    'Thu Sep 25 10:36:28 2003',\n",
    "                    'Thu Sep 25 10:36:28 BRST 2003',\n",
    "                    '2003 10:36:28 BRST 25 Sep Thu',\n",
    "                    'Thu Sep 25 10:36:28 2003',\n",
    "                    'Thu 10:36:28',\n",
    "                    'Thu 10:36',\n",
    "                    '10:36',\n",
    "                    'Thu Sep 25 2003',\n",
    "                    'Sep 25 2003',\n",
    "                    'Sep 2003',\n",
    "                    'Sep',\n",
    "                    '2003',\n",
    "                    '2003-09-25',\n",
    "                    '2003-Sep-25',\n",
    "                    '25-Sep-2003',\n",
    "                    'Sep-25-2003',\n",
    "                    '09-25-2003',\n",
    "                    '10-09-2003',\n",
    "                    '10-09-03',\n",
    "                    '2003.Sep.25',\n",
    "                    '2003/09/25',\n",
    "                    '2003 Sep 25',\n",
    "                    '2003 09 25',\n",
    "                    '10pm',\n",
    "                    '12:00am',\n",
    "                    'Sep 03',\n",
    "                    'Sep of 03',\n",
    "                    'Wed, July 10, 96',\n",
    "                    '1996.07.10 AD at 15:08:56 PDT',\n",
    "                    'Tuesday, April 12, 1952 AD 3:30:42pm PST',\n",
    "                    'November 5, 1994, 8:15:30 am EST',\n",
    "                    '3rd of May 2001',\n",
    "                    '5:50 AM on June 13, 1990', \n",
    "                    'NULL',\n",
    "                    'nan',\n",
    "                    'I\\'m a little cat',\n",
    "                    'This is Sep.']})\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Default `clean_date()`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By default, the `target_format` parameter is set to \"YYYY-MM-DD hh:mm:ss\", the `origin_timezone` parameter is set to \"UTC\", the `fix_empty` parameter is set to \"auto_minimum\" and the `show_report` parameter is set to \"True\". And we don't specify the `target_timezone` parameter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t34 values cleaned (89.47%)\n",
      "\t2 values unable to be parsed (5.26%), set to NaN\n",
      "Result contains 34 (89.47%) values in the correct format and 4 null values (10.53%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10 15:08:56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "      <td>2000-01-01 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "      <td>2000-01-01 10:36:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "      <td>2000-01-01 10:36:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "      <td>2003-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "      <td>2000-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "      <td>2003-01-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "      <td>2003-10-09 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "      <td>2003-10-09 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "      <td>2000-01-01 22:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "      <td>2000-01-01 12:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "      <td>2003-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "      <td>2003-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "      <td>2096-07-10 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10 15:08:56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "      <td>1952-04-12 15:30:42</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "      <td>1994-11-05 08:15:30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>2001-05-03 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>1990-06-13 05:50:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date           date_clean\n",
       "0              1996.07.10 AD at 15:08:56 PDT  1996-07-10 15:08:56\n",
       "1                   Thu Sep 25 10:36:28 2003  2003-09-25 10:36:28\n",
       "2              Thu Sep 25 10:36:28 BRST 2003  2003-09-25 10:36:28\n",
       "3              2003 10:36:28 BRST 25 Sep Thu  2003-09-25 10:36:28\n",
       "4                   Thu Sep 25 10:36:28 2003  2003-09-25 10:36:28\n",
       "5                               Thu 10:36:28  2000-01-01 10:36:28\n",
       "6                                  Thu 10:36  2000-01-01 10:36:00\n",
       "7                                      10:36  2000-01-01 10:36:00\n",
       "8                            Thu Sep 25 2003  2003-09-25 00:00:00\n",
       "9                                Sep 25 2003  2003-09-25 00:00:00\n",
       "10                                  Sep 2003  2003-09-01 00:00:00\n",
       "11                                       Sep  2000-09-01 00:00:00\n",
       "12                                      2003  2003-01-01 00:00:00\n",
       "13                                2003-09-25  2003-09-25 00:00:00\n",
       "14                               2003-Sep-25  2003-09-25 00:00:00\n",
       "15                               25-Sep-2003  2003-09-25 00:00:00\n",
       "16                               Sep-25-2003  2003-09-25 00:00:00\n",
       "17                                09-25-2003  2003-09-25 00:00:00\n",
       "18                                10-09-2003  2003-10-09 00:00:00\n",
       "19                                  10-09-03  2003-10-09 00:00:00\n",
       "20                               2003.Sep.25  2003-09-25 00:00:00\n",
       "21                                2003/09/25  2003-09-25 00:00:00\n",
       "22                               2003 Sep 25  2003-09-25 00:00:00\n",
       "23                                2003 09 25  2003-09-25 00:00:00\n",
       "24                                      10pm  2000-01-01 22:00:00\n",
       "25                                   12:00am  2000-01-01 12:00:00\n",
       "26                                    Sep 03  2003-09-01 00:00:00\n",
       "27                                 Sep of 03  2003-09-01 00:00:00\n",
       "28                          Wed, July 10, 96  2096-07-10 00:00:00\n",
       "29             1996.07.10 AD at 15:08:56 PDT  1996-07-10 15:08:56\n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST  1952-04-12 15:30:42\n",
       "31          November 5, 1994, 8:15:30 am EST  1994-11-05 08:15:30\n",
       "32                           3rd of May 2001  2001-05-03 00:00:00\n",
       "33                  5:50 AM on June 13, 1990  1990-06-13 05:50:00\n",
       "34                                      NULL                  NaN\n",
       "35                                       nan                  NaN\n",
       "36                          I'm a little cat                  NaN\n",
       "37                              This is Sep.                  NaN"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from dataprep.clean import clean_date\n",
    "clean_date(df, 'date')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. `output_format` parameter\n",
    "This section demonstrate some valid target format. In fact, our function can support very flexible target formats, such as `YYYY-MM-DD` and `yyyy.MM.dd AD at HH:mm:ss z`. Users just need to specify tokens standing for year, month, day, hour, minute and second with valid separators. \n",
    "\n",
    "The tokens we support are listed in the following table.\n",
    "\n",
    "|  Component | Token |\n",
    "|  ----      | ----  |\n",
    "|  Year      | `\"yyyy\"(2015), \"yy\"(15), \"YYYY\"(2015), \"YY\"(15), \"Y\"(15), \"y\"(15)` |\n",
    "|  Month     | `\"MM\"(01), \"M\"(1), \"MMM\"(Jan.), \"MMMMM\"(January)` |\n",
    "|  Day       | `\"dd\"(05), \"d\"(5), \"DD\"(05), \"D\"(5)` |\n",
    "|  Hour      | `\"hh\"(06), \"h\"(6), \"HH\"(06), \"H\"(6)` |\n",
    "|  Minute    | `\"mm\"(08), \"m\"(8)` |\n",
    "|  Second    | `\"ss\"(09), \"s\"(9), \"SS\"(09), \"S\"(9)` |\n",
    "|  Weekday   | `\"eee\"(Mon.), \"EEE\"(Mon.), \"eeeee\"(Monday), \"EEEEE\"(Monday)` |\n",
    "|  Timezone  | `\"Z\"(UTC+00:00),'z'(GMT)` |\n",
    "\n",
    "The separators we support are listed here: `[\" \", \".\", \",\", \";\", \"-\", \"/\", \"'\", \"st\", \"nd\", \"rd\", \"th\", \"at\", \"on\", \"and\", \"ad\", \"AD\", \"of\"]`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example format: `YYYY-MM-DD`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t33 values cleaned (86.84%)\n",
      "\t2 values unable to be parsed (5.26%), set to NaN\n",
      "Result contains 34 (89.47%) values in the correct format and 4 null values (10.53%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "      <td>2000-01-01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "      <td>2000-01-01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "      <td>2000-01-01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "      <td>2003-09-01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "      <td>2000-09-01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "      <td>2003-01-01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "      <td>2003-10-09</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "      <td>2003-10-09</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "      <td>2003-09-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "      <td>2000-01-01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "      <td>2000-01-01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "      <td>2003-09-01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "      <td>2003-09-01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "      <td>2096-07-10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "      <td>1952-04-12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "      <td>1994-11-05</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>2001-05-03</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>1990-06-13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date  date_clean\n",
       "0              1996.07.10 AD at 15:08:56 PDT  1996-07-10\n",
       "1                   Thu Sep 25 10:36:28 2003  2003-09-25\n",
       "2              Thu Sep 25 10:36:28 BRST 2003  2003-09-25\n",
       "3              2003 10:36:28 BRST 25 Sep Thu  2003-09-25\n",
       "4                   Thu Sep 25 10:36:28 2003  2003-09-25\n",
       "5                               Thu 10:36:28  2000-01-01\n",
       "6                                  Thu 10:36  2000-01-01\n",
       "7                                      10:36  2000-01-01\n",
       "8                            Thu Sep 25 2003  2003-09-25\n",
       "9                                Sep 25 2003  2003-09-25\n",
       "10                                  Sep 2003  2003-09-01\n",
       "11                                       Sep  2000-09-01\n",
       "12                                      2003  2003-01-01\n",
       "13                                2003-09-25  2003-09-25\n",
       "14                               2003-Sep-25  2003-09-25\n",
       "15                               25-Sep-2003  2003-09-25\n",
       "16                               Sep-25-2003  2003-09-25\n",
       "17                                09-25-2003  2003-09-25\n",
       "18                                10-09-2003  2003-10-09\n",
       "19                                  10-09-03  2003-10-09\n",
       "20                               2003.Sep.25  2003-09-25\n",
       "21                                2003/09/25  2003-09-25\n",
       "22                               2003 Sep 25  2003-09-25\n",
       "23                                2003 09 25  2003-09-25\n",
       "24                                      10pm  2000-01-01\n",
       "25                                   12:00am  2000-01-01\n",
       "26                                    Sep 03  2003-09-01\n",
       "27                                 Sep of 03  2003-09-01\n",
       "28                          Wed, July 10, 96  2096-07-10\n",
       "29             1996.07.10 AD at 15:08:56 PDT  1996-07-10\n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST  1952-04-12\n",
       "31          November 5, 1994, 8:15:30 am EST  1994-11-05\n",
       "32                           3rd of May 2001  2001-05-03\n",
       "33                  5:50 AM on June 13, 1990  1990-06-13\n",
       "34                                      NULL         NaN\n",
       "35                                       nan         NaN\n",
       "36                          I'm a little cat         NaN\n",
       "37                              This is Sep.         NaN"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clean_date(df, 'date', output_format='YYYY-MM-DD')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
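    "### Example format: `yyyy.MM.dd AD at HH:mm:ss Z`\n",
    "\n",
    "The uppercase `Z` token renders the timezone together with its UTC offset, for example `UTC+00:00`."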
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t34 values cleaned (89.47%)\n",
      "\t2 values unable to be parsed (5.26%), set to NaN\n",
      "Result contains 34 (89.47%) values in the correct format and 4 null values (10.53%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996.07.10 AD at 15:08:56 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003.09.25 AD at 10:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "      <td>2003.09.25 AD at 10:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>2003.09.25 AD at 10:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003.09.25 AD at 10:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "      <td>2000.01.01 AD at 10:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "      <td>2000.01.01 AD at 10:36:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "      <td>2000.01.01 AD at 10:36:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "      <td>2003.09.01 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "      <td>2000.09.01 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "      <td>2003.01.01 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "      <td>2003.10.09 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "      <td>2003.10.09 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "      <td>2000.01.01 AD at 22:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "      <td>2000.01.01 AD at 12:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "      <td>2003.09.01 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "      <td>2003.09.01 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "      <td>2096.07.10 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996.07.10 AD at 15:08:56 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "      <td>1952.04.12 AD at 15:30:42 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "      <td>1994.11.05 AD at 08:15:30 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>2001.05.03 AD at 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>1990.06.13 AD at 05:50:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date  \\\n",
       "0              1996.07.10 AD at 15:08:56 PDT   \n",
       "1                   Thu Sep 25 10:36:28 2003   \n",
       "2              Thu Sep 25 10:36:28 BRST 2003   \n",
       "3              2003 10:36:28 BRST 25 Sep Thu   \n",
       "4                   Thu Sep 25 10:36:28 2003   \n",
       "5                               Thu 10:36:28   \n",
       "6                                  Thu 10:36   \n",
       "7                                      10:36   \n",
       "8                            Thu Sep 25 2003   \n",
       "9                                Sep 25 2003   \n",
       "10                                  Sep 2003   \n",
       "11                                       Sep   \n",
       "12                                      2003   \n",
       "13                                2003-09-25   \n",
       "14                               2003-Sep-25   \n",
       "15                               25-Sep-2003   \n",
       "16                               Sep-25-2003   \n",
       "17                                09-25-2003   \n",
       "18                                10-09-2003   \n",
       "19                                  10-09-03   \n",
       "20                               2003.Sep.25   \n",
       "21                                2003/09/25   \n",
       "22                               2003 Sep 25   \n",
       "23                                2003 09 25   \n",
       "24                                      10pm   \n",
       "25                                   12:00am   \n",
       "26                                    Sep 03   \n",
       "27                                 Sep of 03   \n",
       "28                          Wed, July 10, 96   \n",
       "29             1996.07.10 AD at 15:08:56 PDT   \n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST   \n",
       "31          November 5, 1994, 8:15:30 am EST   \n",
       "32                           3rd of May 2001   \n",
       "33                  5:50 AM on June 13, 1990   \n",
       "34                                      NULL   \n",
       "35                                       nan   \n",
       "36                          I'm a little cat   \n",
       "37                              This is Sep.   \n",
       "\n",
       "                             date_clean  \n",
       "0   1996.07.10 AD at 15:08:56 UTC+00:00  \n",
       "1   2003.09.25 AD at 10:36:28 UTC+00:00  \n",
       "2   2003.09.25 AD at 10:36:28 UTC+00:00  \n",
       "3   2003.09.25 AD at 10:36:28 UTC+00:00  \n",
       "4   2003.09.25 AD at 10:36:28 UTC+00:00  \n",
       "5   2000.01.01 AD at 10:36:28 UTC+00:00  \n",
       "6   2000.01.01 AD at 10:36:00 UTC+00:00  \n",
       "7   2000.01.01 AD at 10:36:00 UTC+00:00  \n",
       "8   2003.09.25 AD at 00:00:00 UTC+00:00  \n",
       "9   2003.09.25 AD at 00:00:00 UTC+00:00  \n",
       "10  2003.09.01 AD at 00:00:00 UTC+00:00  \n",
       "11  2000.09.01 AD at 00:00:00 UTC+00:00  \n",
       "12  2003.01.01 AD at 00:00:00 UTC+00:00  \n",
       "13  2003.09.25 AD at 00:00:00 UTC+00:00  \n",
       "14  2003.09.25 AD at 00:00:00 UTC+00:00  \n",
       "15  2003.09.25 AD at 00:00:00 UTC+00:00  \n",
       "16  2003.09.25 AD at 00:00:00 UTC+00:00  \n",
       "17  2003.09.25 AD at 00:00:00 UTC+00:00  \n",
       "18  2003.10.09 AD at 00:00:00 UTC+00:00  \n",
       "19  2003.10.09 AD at 00:00:00 UTC+00:00  \n",
       "20  2003.09.25 AD at 00:00:00 UTC+00:00  \n",
       "21  2003.09.25 AD at 00:00:00 UTC+00:00  \n",
       "22  2003.09.25 AD at 00:00:00 UTC+00:00  \n",
       "23  2003.09.25 AD at 00:00:00 UTC+00:00  \n",
       "24  2000.01.01 AD at 22:00:00 UTC+00:00  \n",
       "25  2000.01.01 AD at 12:00:00 UTC+00:00  \n",
       "26  2003.09.01 AD at 00:00:00 UTC+00:00  \n",
       "27  2003.09.01 AD at 00:00:00 UTC+00:00  \n",
       "28  2096.07.10 AD at 00:00:00 UTC+00:00  \n",
       "29  1996.07.10 AD at 15:08:56 UTC+00:00  \n",
       "30  1952.04.12 AD at 15:30:42 UTC+00:00  \n",
       "31  1994.11.05 AD at 08:15:30 UTC+00:00  \n",
       "32  2001.05.03 AD at 00:00:00 UTC+00:00  \n",
       "33  1990.06.13 AD at 05:50:00 UTC+00:00  \n",
       "34                                  NaN  \n",
       "35                                  NaN  \n",
       "36                                  NaN  \n",
       "37                                  NaN  "
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clean_date(df, 'date', output_format='yyyy.MM.dd AD at HH:mm:ss Z')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example format: `yyyy.MM.dd AD at HH:mm:ss z`\n",
    "\n",
    "The lowercase `z` token renders the timezone name only, for example `UTC`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t34 values cleaned (89.47%)\n",
      "\t2 values unable to be parsed (5.26%), set to NaN\n",
      "Result contains 34 (89.47%) values in the correct format and 4 null values (10.53%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996.07.10 AD at 15:08:56 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003.09.25 AD at 10:36:28 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "      <td>2003.09.25 AD at 10:36:28 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>2003.09.25 AD at 10:36:28 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003.09.25 AD at 10:36:28 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "      <td>2000.01.01 AD at 10:36:28 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "      <td>2000.01.01 AD at 10:36:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "      <td>2000.01.01 AD at 10:36:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "      <td>2003.09.01 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "      <td>2000.09.01 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "      <td>2003.01.01 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "      <td>2003.10.09 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "      <td>2003.10.09 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "      <td>2003.09.25 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "      <td>2000.01.01 AD at 22:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "      <td>2000.01.01 AD at 12:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "      <td>2003.09.01 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "      <td>2003.09.01 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "      <td>2096.07.10 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996.07.10 AD at 15:08:56 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "      <td>1952.04.12 AD at 15:30:42 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "      <td>1994.11.05 AD at 08:15:30 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>2001.05.03 AD at 00:00:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>1990.06.13 AD at 05:50:00 UTC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date                     date_clean\n",
       "0              1996.07.10 AD at 15:08:56 PDT  1996.07.10 AD at 15:08:56 UTC\n",
       "1                   Thu Sep 25 10:36:28 2003  2003.09.25 AD at 10:36:28 UTC\n",
       "2              Thu Sep 25 10:36:28 BRST 2003  2003.09.25 AD at 10:36:28 UTC\n",
       "3              2003 10:36:28 BRST 25 Sep Thu  2003.09.25 AD at 10:36:28 UTC\n",
       "4                   Thu Sep 25 10:36:28 2003  2003.09.25 AD at 10:36:28 UTC\n",
       "5                               Thu 10:36:28  2000.01.01 AD at 10:36:28 UTC\n",
       "6                                  Thu 10:36  2000.01.01 AD at 10:36:00 UTC\n",
       "7                                      10:36  2000.01.01 AD at 10:36:00 UTC\n",
       "8                            Thu Sep 25 2003  2003.09.25 AD at 00:00:00 UTC\n",
       "9                                Sep 25 2003  2003.09.25 AD at 00:00:00 UTC\n",
       "10                                  Sep 2003  2003.09.01 AD at 00:00:00 UTC\n",
       "11                                       Sep  2000.09.01 AD at 00:00:00 UTC\n",
       "12                                      2003  2003.01.01 AD at 00:00:00 UTC\n",
       "13                                2003-09-25  2003.09.25 AD at 00:00:00 UTC\n",
       "14                               2003-Sep-25  2003.09.25 AD at 00:00:00 UTC\n",
       "15                               25-Sep-2003  2003.09.25 AD at 00:00:00 UTC\n",
       "16                               Sep-25-2003  2003.09.25 AD at 00:00:00 UTC\n",
       "17                                09-25-2003  2003.09.25 AD at 00:00:00 UTC\n",
       "18                                10-09-2003  2003.10.09 AD at 00:00:00 UTC\n",
       "19                                  10-09-03  2003.10.09 AD at 00:00:00 UTC\n",
       "20                               2003.Sep.25  2003.09.25 AD at 00:00:00 UTC\n",
       "21                                2003/09/25  2003.09.25 AD at 00:00:00 UTC\n",
       "22                               2003 Sep 25  2003.09.25 AD at 00:00:00 UTC\n",
       "23                                2003 09 25  2003.09.25 AD at 00:00:00 UTC\n",
       "24                                      10pm  2000.01.01 AD at 22:00:00 UTC\n",
       "25                                   12:00am  2000.01.01 AD at 12:00:00 UTC\n",
       "26                                    Sep 03  2003.09.01 AD at 00:00:00 UTC\n",
       "27                                 Sep of 03  2003.09.01 AD at 00:00:00 UTC\n",
       "28                          Wed, July 10, 96  2096.07.10 AD at 00:00:00 UTC\n",
       "29             1996.07.10 AD at 15:08:56 PDT  1996.07.10 AD at 15:08:56 UTC\n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST  1952.04.12 AD at 15:30:42 UTC\n",
       "31          November 5, 1994, 8:15:30 am EST  1994.11.05 AD at 08:15:30 UTC\n",
       "32                           3rd of May 2001  2001.05.03 AD at 00:00:00 UTC\n",
       "33                  5:50 AM on June 13, 1990  1990.06.13 AD at 05:50:00 UTC\n",
       "34                                      NULL                            NaN\n",
       "35                                       nan                            NaN\n",
       "36                          I'm a little cat                            NaN\n",
       "37                              This is Sep.                            NaN"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clean_date(df, 'date', output_format='yyyy.MM.dd AD at HH:mm:ss z')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example format: `EEE, d MMM yyyy HH:mm:ss Z`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t34 values cleaned (89.47%)\n",
      "\t2 values unable to be parsed (5.26%), set to NaN\n",
      "Result contains 34 (89.47%) values in the correct format and 4 null values (10.53%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>Wed, 10 Jul 1996 15:08:56 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>Thu, 25 Sep 2003 10:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "      <td>Thu, 25 Sep 2003 10:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>Thu, 25 Sep 2003 10:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>Thu, 25 Sep 2003 10:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "      <td>Thu, 1 Jan 2000 10:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "      <td>Thu, 1 Jan 2000 10:36:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "      <td>Sat, 1 Jan 2000 10:36:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "      <td>Thu, 25 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "      <td>Thu, 25 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "      <td>Mon, 1 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "      <td>Fri, 1 Sep 2000 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "      <td>Wed, 1 Jan 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "      <td>Thu, 25 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>Thu, 25 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "      <td>Thu, 25 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "      <td>Thu, 25 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "      <td>Thu, 25 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "      <td>Thu, 9 Oct 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "      <td>Thu, 9 Oct 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "      <td>Thu, 25 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "      <td>Thu, 25 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "      <td>Thu, 25 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "      <td>Thu, 25 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "      <td>Sat, 1 Jan 2000 22:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "      <td>Sat, 1 Jan 2000 12:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "      <td>Mon, 1 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "      <td>Mon, 1 Sep 2003 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "      <td>Wed, 10 Jul 2096 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>Wed, 10 Jul 1996 15:08:56 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "      <td>Tue, 12 Apr 1952 15:30:42 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "      <td>Sat, 5 Nov 1994 08:15:30 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>Thu, 3 May 2001 00:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>Wed, 13 Jun 1990 05:50:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date  \\\n",
       "0              1996.07.10 AD at 15:08:56 PDT   \n",
       "1                   Thu Sep 25 10:36:28 2003   \n",
       "2              Thu Sep 25 10:36:28 BRST 2003   \n",
       "3              2003 10:36:28 BRST 25 Sep Thu   \n",
       "4                   Thu Sep 25 10:36:28 2003   \n",
       "5                               Thu 10:36:28   \n",
       "6                                  Thu 10:36   \n",
       "7                                      10:36   \n",
       "8                            Thu Sep 25 2003   \n",
       "9                                Sep 25 2003   \n",
       "10                                  Sep 2003   \n",
       "11                                       Sep   \n",
       "12                                      2003   \n",
       "13                                2003-09-25   \n",
       "14                               2003-Sep-25   \n",
       "15                               25-Sep-2003   \n",
       "16                               Sep-25-2003   \n",
       "17                                09-25-2003   \n",
       "18                                10-09-2003   \n",
       "19                                  10-09-03   \n",
       "20                               2003.Sep.25   \n",
       "21                                2003/09/25   \n",
       "22                               2003 Sep 25   \n",
       "23                                2003 09 25   \n",
       "24                                      10pm   \n",
       "25                                   12:00am   \n",
       "26                                    Sep 03   \n",
       "27                                 Sep of 03   \n",
       "28                          Wed, July 10, 96   \n",
       "29             1996.07.10 AD at 15:08:56 PDT   \n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST   \n",
       "31          November 5, 1994, 8:15:30 am EST   \n",
       "32                           3rd of May 2001   \n",
       "33                  5:50 AM on June 13, 1990   \n",
       "34                                      NULL   \n",
       "35                                       nan   \n",
       "36                          I'm a little cat   \n",
       "37                              This is Sep.   \n",
       "\n",
       "                             date_clean  \n",
       "0   Wed, 10 Jul 1996 15:08:56 UTC+00:00  \n",
       "1   Thu, 25 Sep 2003 10:36:28 UTC+00:00  \n",
       "2   Thu, 25 Sep 2003 10:36:28 UTC+00:00  \n",
       "3   Thu, 25 Sep 2003 10:36:28 UTC+00:00  \n",
       "4   Thu, 25 Sep 2003 10:36:28 UTC+00:00  \n",
       "5    Thu, 1 Jan 2000 10:36:28 UTC+00:00  \n",
       "6    Thu, 1 Jan 2000 10:36:00 UTC+00:00  \n",
       "7    Sat, 1 Jan 2000 10:36:00 UTC+00:00  \n",
       "8   Thu, 25 Sep 2003 00:00:00 UTC+00:00  \n",
       "9   Thu, 25 Sep 2003 00:00:00 UTC+00:00  \n",
       "10   Mon, 1 Sep 2003 00:00:00 UTC+00:00  \n",
       "11   Fri, 1 Sep 2000 00:00:00 UTC+00:00  \n",
       "12   Wed, 1 Jan 2003 00:00:00 UTC+00:00  \n",
       "13  Thu, 25 Sep 2003 00:00:00 UTC+00:00  \n",
       "14  Thu, 25 Sep 2003 00:00:00 UTC+00:00  \n",
       "15  Thu, 25 Sep 2003 00:00:00 UTC+00:00  \n",
       "16  Thu, 25 Sep 2003 00:00:00 UTC+00:00  \n",
       "17  Thu, 25 Sep 2003 00:00:00 UTC+00:00  \n",
       "18   Thu, 9 Oct 2003 00:00:00 UTC+00:00  \n",
       "19   Thu, 9 Oct 2003 00:00:00 UTC+00:00  \n",
       "20  Thu, 25 Sep 2003 00:00:00 UTC+00:00  \n",
       "21  Thu, 25 Sep 2003 00:00:00 UTC+00:00  \n",
       "22  Thu, 25 Sep 2003 00:00:00 UTC+00:00  \n",
       "23  Thu, 25 Sep 2003 00:00:00 UTC+00:00  \n",
       "24   Sat, 1 Jan 2000 22:00:00 UTC+00:00  \n",
       "25   Sat, 1 Jan 2000 12:00:00 UTC+00:00  \n",
       "26   Mon, 1 Sep 2003 00:00:00 UTC+00:00  \n",
       "27   Mon, 1 Sep 2003 00:00:00 UTC+00:00  \n",
       "28  Wed, 10 Jul 2096 00:00:00 UTC+00:00  \n",
       "29  Wed, 10 Jul 1996 15:08:56 UTC+00:00  \n",
       "30  Tue, 12 Apr 1952 15:30:42 UTC+00:00  \n",
       "31   Sat, 5 Nov 1994 08:15:30 UTC+00:00  \n",
       "32   Thu, 3 May 2001 00:00:00 UTC+00:00  \n",
       "33  Wed, 13 Jun 1990 05:50:00 UTC+00:00  \n",
       "34                                  NaN  \n",
       "35                                  NaN  \n",
       "36                                  NaN  \n",
       "37                                  NaN  "
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clean_date(df, 'date', output_format='EEE, d MMM yyyy HH:mm:ss Z')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. `input_timezone` and `output_timezone` parameter\n",
    "This section demostrates valide origin timezones and target timezones. `input_timezone` in our function means user-specified timezone for input data. `output_timezone` in our function means user-specified timezone for output data.\n",
    "\n",
    "In our function, the range of `input_timezone` and `output_timezone` includes two parts:\n",
    "* All timezones in `pytz.all_timezones`\n",
    "* Abbreviation for common-used timezones\n",
    "\n",
    "|  Timezone Name | UTC offset |\n",
    "|  ----      | ----  |\n",
    "|UTC             | 0          |\n",
    "|ACT| -5|\n",
    "|ADT|-3|\n",
    "|AEDT|11|\n",
    "|AEST|10|\n",
    "|AKDT|-8|\n",
    "|AKST|-9|\n",
    "|AMST|-3|\n",
    "|AMT|-4|\n",
    "|ART|-3|\n",
    "|ArabiaST|3|\n",
    "|AtlanticST|-4|\n",
    "|AWST|8|\n",
    "|AZOST|0|\n",
    "|AZOT|0|\n",
    "|BOT|-4|\n",
    "|BRST|-2|\n",
    "|BRT|-3|\n",
    "|BST|1|\n",
    "|BTT|6|\n",
    "|CAT|2|\n",
    "|CDT|-5|\n",
    "|CEST|2|\n",
    "|CET|1|\n",
    "|CHOST|9|\n",
    "|CHOT|8|\n",
    "|CHUT|10|\n",
    "|CKT|-10|\n",
    "|CLST|-3|\n",
    "|CLT|-4|\n",
    "|CentralST|-6|\n",
    "|ChinaST|8|\n",
    "|CubaST|-5|\n",
    "|ChST|10|\n",
    "|EASST|-5|\n",
    "|EAST|-6|\n",
    "|EAT|3|\n",
    "|ECT|-5|\n",
    "|EDT|-4|\n",
    "|EEST|3|\n",
    "|EET|2|\n",
    "|EST|-5|\n",
    "|FKST|-3|\n",
    "|GFT|-3|\n",
    "|GILT|12|\n",
    "|GMT|0|\n",
    "|GST|4|\n",
    "|HKT|8|\n",
    "|HST|-10|\n",
    "|ICT|7|\n",
    "|IDT|3|\n",
    "|IrishST|1|\n",
    "|IsraelST|2|\n",
    "|JST|9|\n",
    "|KOST|11|\n",
    "|LINT|4|\n",
    "|MDT|-6|\n",
    "|MHT|12|\n",
    "|MSK|3|\n",
    "|MST|-7|\n",
    "|MYT|8|\n",
    "|NUT|-11|\n",
    "|NZDT|13|\n",
    "|NZST|12|\n",
    "|PDT|-7|\n",
    "|PET|-5|\n",
    "|PGT|10|\n",
    "|PHT|8|\n",
    "|PONT|11|\n",
    "|PST|-8|\n",
    "|SAST|2|\n",
    "|SBT|11|\n",
    "|SGT|8|\n",
    "|SRT|-3|\n",
    "|SST|-11|\n",
    "|TAHT|-10|\n",
    "|TLT|9|\n",
    "|TVT|12|\n",
    "|ULAST|9|\n",
    "|ULAT|8|\n",
    "|UYST|-2|\n",
    "|UYT|-3|\n",
    "|VET|-4|\n",
    "|WAST|2|\n",
    "|WAT|1|\n",
    "|WEST|1|\n",
    "|WET|0|\n",
    "|WIB|7|\n",
    "|WIT|9|\n",
    "|WITA|8|"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example format: \n",
    "`input_timezone`: `PDT`\n",
    "\n",
    "`output_timezone`: `ChinaST`\n",
    "\n",
    "`output_format`: `yyyy.MM.dd AD at HH:mm:ss Z`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t34 values cleaned (89.47%)\n",
      "\t2 values unable to be parsed (5.26%), set to NaN\n",
      "Result contains 34 (89.47%) values in the correct format and 4 null values (10.53%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996.07.11 AD at 06:08:56 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003.09.26 AD at 01:36:28 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "      <td>2003.09.26 AD at 01:36:28 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>2003.09.26 AD at 01:36:28 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003.09.26 AD at 01:36:28 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "      <td>2000.01.02 AD at 01:36:28 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "      <td>2000.01.02 AD at 01:36:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "      <td>2000.01.02 AD at 01:36:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "      <td>2003.09.25 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "      <td>2003.09.25 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "      <td>2003.09.01 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "      <td>2000.09.01 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "      <td>2003.01.01 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "      <td>2003.09.25 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>2003.09.25 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "      <td>2003.09.25 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "      <td>2003.09.25 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "      <td>2003.09.25 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "      <td>2003.10.09 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "      <td>2003.10.09 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "      <td>2003.09.25 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "      <td>2003.09.25 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "      <td>2003.09.25 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "      <td>2003.09.25 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "      <td>2000.01.02 AD at 13:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "      <td>2000.01.02 AD at 03:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "      <td>2003.09.01 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "      <td>2003.09.01 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "      <td>2096.07.10 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996.07.11 AD at 06:08:56 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "      <td>1952.04.13 AD at 06:30:42 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "      <td>1994.11.05 AD at 23:15:30 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>2001.05.03 AD at 15:00:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>1990.06.13 AD at 20:50:00 UTC+08:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date  \\\n",
       "0              1996.07.10 AD at 15:08:56 PDT   \n",
       "1                   Thu Sep 25 10:36:28 2003   \n",
       "2              Thu Sep 25 10:36:28 BRST 2003   \n",
       "3              2003 10:36:28 BRST 25 Sep Thu   \n",
       "4                   Thu Sep 25 10:36:28 2003   \n",
       "5                               Thu 10:36:28   \n",
       "6                                  Thu 10:36   \n",
       "7                                      10:36   \n",
       "8                            Thu Sep 25 2003   \n",
       "9                                Sep 25 2003   \n",
       "10                                  Sep 2003   \n",
       "11                                       Sep   \n",
       "12                                      2003   \n",
       "13                                2003-09-25   \n",
       "14                               2003-Sep-25   \n",
       "15                               25-Sep-2003   \n",
       "16                               Sep-25-2003   \n",
       "17                                09-25-2003   \n",
       "18                                10-09-2003   \n",
       "19                                  10-09-03   \n",
       "20                               2003.Sep.25   \n",
       "21                                2003/09/25   \n",
       "22                               2003 Sep 25   \n",
       "23                                2003 09 25   \n",
       "24                                      10pm   \n",
       "25                                   12:00am   \n",
       "26                                    Sep 03   \n",
       "27                                 Sep of 03   \n",
       "28                          Wed, July 10, 96   \n",
       "29             1996.07.10 AD at 15:08:56 PDT   \n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST   \n",
       "31          November 5, 1994, 8:15:30 am EST   \n",
       "32                           3rd of May 2001   \n",
       "33                  5:50 AM on June 13, 1990   \n",
       "34                                      NULL   \n",
       "35                                       nan   \n",
       "36                          I'm a little cat   \n",
       "37                              This is Sep.   \n",
       "\n",
       "                             date_clean  \n",
       "0   1996.07.11 AD at 06:08:56 UTC+08:00  \n",
       "1   2003.09.26 AD at 01:36:28 UTC+08:00  \n",
       "2   2003.09.26 AD at 01:36:28 UTC+08:00  \n",
       "3   2003.09.26 AD at 01:36:28 UTC+08:00  \n",
       "4   2003.09.26 AD at 01:36:28 UTC+08:00  \n",
       "5   2000.01.02 AD at 01:36:28 UTC+08:00  \n",
       "6   2000.01.02 AD at 01:36:00 UTC+08:00  \n",
       "7   2000.01.02 AD at 01:36:00 UTC+08:00  \n",
       "8   2003.09.25 AD at 15:00:00 UTC+08:00  \n",
       "9   2003.09.25 AD at 15:00:00 UTC+08:00  \n",
       "10  2003.09.01 AD at 15:00:00 UTC+08:00  \n",
       "11  2000.09.01 AD at 15:00:00 UTC+08:00  \n",
       "12  2003.01.01 AD at 15:00:00 UTC+08:00  \n",
       "13  2003.09.25 AD at 15:00:00 UTC+08:00  \n",
       "14  2003.09.25 AD at 15:00:00 UTC+08:00  \n",
       "15  2003.09.25 AD at 15:00:00 UTC+08:00  \n",
       "16  2003.09.25 AD at 15:00:00 UTC+08:00  \n",
       "17  2003.09.25 AD at 15:00:00 UTC+08:00  \n",
       "18  2003.10.09 AD at 15:00:00 UTC+08:00  \n",
       "19  2003.10.09 AD at 15:00:00 UTC+08:00  \n",
       "20  2003.09.25 AD at 15:00:00 UTC+08:00  \n",
       "21  2003.09.25 AD at 15:00:00 UTC+08:00  \n",
       "22  2003.09.25 AD at 15:00:00 UTC+08:00  \n",
       "23  2003.09.25 AD at 15:00:00 UTC+08:00  \n",
       "24  2000.01.02 AD at 13:00:00 UTC+08:00  \n",
       "25  2000.01.02 AD at 03:00:00 UTC+08:00  \n",
       "26  2003.09.01 AD at 15:00:00 UTC+08:00  \n",
       "27  2003.09.01 AD at 15:00:00 UTC+08:00  \n",
       "28  2096.07.10 AD at 15:00:00 UTC+08:00  \n",
       "29  1996.07.11 AD at 06:08:56 UTC+08:00  \n",
       "30  1952.04.13 AD at 06:30:42 UTC+08:00  \n",
       "31  1994.11.05 AD at 23:15:30 UTC+08:00  \n",
       "32  2001.05.03 AD at 15:00:00 UTC+08:00  \n",
       "33  1990.06.13 AD at 20:50:00 UTC+08:00  \n",
       "34                                  NaN  \n",
       "35                                  NaN  \n",
       "36                                  NaN  \n",
       "37                                  NaN  "
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clean_date(df, 'date', input_timezone='PDT', output_timezone='ChinaST',output_format='yyyy.MM.dd AD at HH:mm:ss Z')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example format: \n",
    "`input_timezone`: `EST`\n",
    "\n",
    "`output_timezone`: `PDT`\n",
    "\n",
    "`output_format`: `yyyy.MM.dd AD at HH:mm:ss Z`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t34 values cleaned (89.47%)\n",
      "\t2 values unable to be parsed (5.26%), set to NaN\n",
      "Result contains 34 (89.47%) values in the correct format and 4 null values (10.53%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996.07.12 AD at 03:08:56 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003.09.26 AD at 22:36:28 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "      <td>2003.09.26 AD at 22:36:28 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>2003.09.26 AD at 22:36:28 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003.09.26 AD at 22:36:28 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "      <td>2000.01.02 AD at 22:36:28 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "      <td>2000.01.02 AD at 22:36:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "      <td>2000.01.02 AD at 22:36:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "      <td>2003.09.26 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "      <td>2003.09.26 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "      <td>2003.09.02 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "      <td>2000.09.02 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "      <td>2003.01.02 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "      <td>2003.09.26 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>2003.09.26 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "      <td>2003.09.26 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "      <td>2003.09.26 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "      <td>2003.09.26 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "      <td>2003.10.10 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "      <td>2003.10.10 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "      <td>2003.09.26 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "      <td>2003.09.26 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "      <td>2003.09.26 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "      <td>2003.09.26 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "      <td>2000.01.03 AD at 10:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "      <td>2000.01.03 AD at 00:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "      <td>2003.09.02 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "      <td>2003.09.02 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "      <td>2096.07.11 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996.07.12 AD at 03:08:56 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "      <td>1952.04.14 AD at 03:30:42 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "      <td>1994.11.06 AD at 20:15:30 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>2001.05.04 AD at 12:00:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>1990.06.14 AD at 17:50:00 UTC-07:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date  \\\n",
       "0              1996.07.10 AD at 15:08:56 PDT   \n",
       "1                   Thu Sep 25 10:36:28 2003   \n",
       "2              Thu Sep 25 10:36:28 BRST 2003   \n",
       "3              2003 10:36:28 BRST 25 Sep Thu   \n",
       "4                   Thu Sep 25 10:36:28 2003   \n",
       "5                               Thu 10:36:28   \n",
       "6                                  Thu 10:36   \n",
       "7                                      10:36   \n",
       "8                            Thu Sep 25 2003   \n",
       "9                                Sep 25 2003   \n",
       "10                                  Sep 2003   \n",
       "11                                       Sep   \n",
       "12                                      2003   \n",
       "13                                2003-09-25   \n",
       "14                               2003-Sep-25   \n",
       "15                               25-Sep-2003   \n",
       "16                               Sep-25-2003   \n",
       "17                                09-25-2003   \n",
       "18                                10-09-2003   \n",
       "19                                  10-09-03   \n",
       "20                               2003.Sep.25   \n",
       "21                                2003/09/25   \n",
       "22                               2003 Sep 25   \n",
       "23                                2003 09 25   \n",
       "24                                      10pm   \n",
       "25                                   12:00am   \n",
       "26                                    Sep 03   \n",
       "27                                 Sep of 03   \n",
       "28                          Wed, July 10, 96   \n",
       "29             1996.07.10 AD at 15:08:56 PDT   \n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST   \n",
       "31          November 5, 1994, 8:15:30 am EST   \n",
       "32                           3rd of May 2001   \n",
       "33                  5:50 AM on June 13, 1990   \n",
       "34                                      NULL   \n",
       "35                                       nan   \n",
       "36                          I'm a little cat   \n",
       "37                              This is Sep.   \n",
       "\n",
       "                             date_clean  \n",
       "0   1996.07.12 AD at 03:08:56 UTC-07:00  \n",
       "1   2003.09.26 AD at 22:36:28 UTC-07:00  \n",
       "2   2003.09.26 AD at 22:36:28 UTC-07:00  \n",
       "3   2003.09.26 AD at 22:36:28 UTC-07:00  \n",
       "4   2003.09.26 AD at 22:36:28 UTC-07:00  \n",
       "5   2000.01.02 AD at 22:36:28 UTC-07:00  \n",
       "6   2000.01.02 AD at 22:36:00 UTC-07:00  \n",
       "7   2000.01.02 AD at 22:36:00 UTC-07:00  \n",
       "8   2003.09.26 AD at 12:00:00 UTC-07:00  \n",
       "9   2003.09.26 AD at 12:00:00 UTC-07:00  \n",
       "10  2003.09.02 AD at 12:00:00 UTC-07:00  \n",
       "11  2000.09.02 AD at 12:00:00 UTC-07:00  \n",
       "12  2003.01.02 AD at 12:00:00 UTC-07:00  \n",
       "13  2003.09.26 AD at 12:00:00 UTC-07:00  \n",
       "14  2003.09.26 AD at 12:00:00 UTC-07:00  \n",
       "15  2003.09.26 AD at 12:00:00 UTC-07:00  \n",
       "16  2003.09.26 AD at 12:00:00 UTC-07:00  \n",
       "17  2003.09.26 AD at 12:00:00 UTC-07:00  \n",
       "18  2003.10.10 AD at 12:00:00 UTC-07:00  \n",
       "19  2003.10.10 AD at 12:00:00 UTC-07:00  \n",
       "20  2003.09.26 AD at 12:00:00 UTC-07:00  \n",
       "21  2003.09.26 AD at 12:00:00 UTC-07:00  \n",
       "22  2003.09.26 AD at 12:00:00 UTC-07:00  \n",
       "23  2003.09.26 AD at 12:00:00 UTC-07:00  \n",
       "24  2000.01.03 AD at 10:00:00 UTC-07:00  \n",
       "25  2000.01.03 AD at 00:00:00 UTC-07:00  \n",
       "26  2003.09.02 AD at 12:00:00 UTC-07:00  \n",
       "27  2003.09.02 AD at 12:00:00 UTC-07:00  \n",
       "28  2096.07.11 AD at 12:00:00 UTC-07:00  \n",
       "29  1996.07.12 AD at 03:08:56 UTC-07:00  \n",
       "30  1952.04.14 AD at 03:30:42 UTC-07:00  \n",
       "31  1994.11.06 AD at 20:15:30 UTC-07:00  \n",
       "32  2001.05.04 AD at 12:00:00 UTC-07:00  \n",
       "33  1990.06.14 AD at 17:50:00 UTC-07:00  \n",
       "34                                  NaN  \n",
       "35                                  NaN  \n",
       "36                                  NaN  \n",
       "37                                  NaN  "
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clean_date(df, 'date', input_timezone='EST', output_timezone='PDT',output_format='yyyy.MM.dd AD at HH:mm:ss Z')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example format: \n",
    "`input_timezone`: `PST`\n",
    "\n",
    "`output_timezone`: `GMT`\n",
    "\n",
    "`output_format`: `yyyy.MM.dd AD at HH:mm:ss Z`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t34 values cleaned (89.47%)\n",
      "\t2 values unable to be parsed (5.26%), set to NaN\n",
      "Result contains 34 (89.47%) values in the correct format and 4 null values (10.53%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996.07.10 AD at 23:08:56 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003.09.25 AD at 18:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "      <td>2003.09.25 AD at 18:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>2003.09.25 AD at 18:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003.09.25 AD at 18:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "      <td>2000.01.01 AD at 18:36:28 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "      <td>2000.01.01 AD at 18:36:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "      <td>2000.01.01 AD at 18:36:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "      <td>2003.09.25 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "      <td>2003.09.25 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "      <td>2003.09.01 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "      <td>2000.09.01 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "      <td>2003.01.01 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "      <td>2003.09.25 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>2003.09.25 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "      <td>2003.09.25 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "      <td>2003.09.25 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "      <td>2003.09.25 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "      <td>2003.10.09 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "      <td>2003.10.09 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "      <td>2003.09.25 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "      <td>2003.09.25 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "      <td>2003.09.25 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "      <td>2003.09.25 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "      <td>2000.01.02 AD at 06:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "      <td>2000.01.01 AD at 20:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "      <td>2003.09.01 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "      <td>2003.09.01 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "      <td>2096.07.10 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996.07.10 AD at 23:08:56 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "      <td>1952.04.12 AD at 23:30:42 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "      <td>1994.11.05 AD at 16:15:30 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>2001.05.03 AD at 08:00:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>1990.06.13 AD at 13:50:00 UTC+00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date  \\\n",
       "0              1996.07.10 AD at 15:08:56 PDT   \n",
       "1                   Thu Sep 25 10:36:28 2003   \n",
       "2              Thu Sep 25 10:36:28 BRST 2003   \n",
       "3              2003 10:36:28 BRST 25 Sep Thu   \n",
       "4                   Thu Sep 25 10:36:28 2003   \n",
       "5                               Thu 10:36:28   \n",
       "6                                  Thu 10:36   \n",
       "7                                      10:36   \n",
       "8                            Thu Sep 25 2003   \n",
       "9                                Sep 25 2003   \n",
       "10                                  Sep 2003   \n",
       "11                                       Sep   \n",
       "12                                      2003   \n",
       "13                                2003-09-25   \n",
       "14                               2003-Sep-25   \n",
       "15                               25-Sep-2003   \n",
       "16                               Sep-25-2003   \n",
       "17                                09-25-2003   \n",
       "18                                10-09-2003   \n",
       "19                                  10-09-03   \n",
       "20                               2003.Sep.25   \n",
       "21                                2003/09/25   \n",
       "22                               2003 Sep 25   \n",
       "23                                2003 09 25   \n",
       "24                                      10pm   \n",
       "25                                   12:00am   \n",
       "26                                    Sep 03   \n",
       "27                                 Sep of 03   \n",
       "28                          Wed, July 10, 96   \n",
       "29             1996.07.10 AD at 15:08:56 PDT   \n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST   \n",
       "31          November 5, 1994, 8:15:30 am EST   \n",
       "32                           3rd of May 2001   \n",
       "33                  5:50 AM on June 13, 1990   \n",
       "34                                      NULL   \n",
       "35                                       nan   \n",
       "36                          I'm a little cat   \n",
       "37                              This is Sep.   \n",
       "\n",
       "                             date_clean  \n",
       "0   1996.07.10 AD at 23:08:56 UTC+00:00  \n",
       "1   2003.09.25 AD at 18:36:28 UTC+00:00  \n",
       "2   2003.09.25 AD at 18:36:28 UTC+00:00  \n",
       "3   2003.09.25 AD at 18:36:28 UTC+00:00  \n",
       "4   2003.09.25 AD at 18:36:28 UTC+00:00  \n",
       "5   2000.01.01 AD at 18:36:28 UTC+00:00  \n",
       "6   2000.01.01 AD at 18:36:00 UTC+00:00  \n",
       "7   2000.01.01 AD at 18:36:00 UTC+00:00  \n",
       "8   2003.09.25 AD at 08:00:00 UTC+00:00  \n",
       "9   2003.09.25 AD at 08:00:00 UTC+00:00  \n",
       "10  2003.09.01 AD at 08:00:00 UTC+00:00  \n",
       "11  2000.09.01 AD at 08:00:00 UTC+00:00  \n",
       "12  2003.01.01 AD at 08:00:00 UTC+00:00  \n",
       "13  2003.09.25 AD at 08:00:00 UTC+00:00  \n",
       "14  2003.09.25 AD at 08:00:00 UTC+00:00  \n",
       "15  2003.09.25 AD at 08:00:00 UTC+00:00  \n",
       "16  2003.09.25 AD at 08:00:00 UTC+00:00  \n",
       "17  2003.09.25 AD at 08:00:00 UTC+00:00  \n",
       "18  2003.10.09 AD at 08:00:00 UTC+00:00  \n",
       "19  2003.10.09 AD at 08:00:00 UTC+00:00  \n",
       "20  2003.09.25 AD at 08:00:00 UTC+00:00  \n",
       "21  2003.09.25 AD at 08:00:00 UTC+00:00  \n",
       "22  2003.09.25 AD at 08:00:00 UTC+00:00  \n",
       "23  2003.09.25 AD at 08:00:00 UTC+00:00  \n",
       "24  2000.01.02 AD at 06:00:00 UTC+00:00  \n",
       "25  2000.01.01 AD at 20:00:00 UTC+00:00  \n",
       "26  2003.09.01 AD at 08:00:00 UTC+00:00  \n",
       "27  2003.09.01 AD at 08:00:00 UTC+00:00  \n",
       "28  2096.07.10 AD at 08:00:00 UTC+00:00  \n",
       "29  1996.07.10 AD at 23:08:56 UTC+00:00  \n",
       "30  1952.04.12 AD at 23:30:42 UTC+00:00  \n",
       "31  1994.11.05 AD at 16:15:30 UTC+00:00  \n",
       "32  2001.05.03 AD at 08:00:00 UTC+00:00  \n",
       "33  1990.06.13 AD at 13:50:00 UTC+00:00  \n",
       "34                                  NaN  \n",
       "35                                  NaN  \n",
       "36                                  NaN  \n",
       "37                                  NaN  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clean_date(df, 'date', input_timezone='PST', output_timezone='GMT',output_format='yyyy.MM.dd AD at HH:mm:ss Z')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. `fix_missing` parameter\n",
    "This section demonstrates the valid options of the `fix_missing` parameter, which controls how missing date components are filled. The accepted values are `'empty'`, `'current'`, and `'minimum'`. The **default** is `'minimum'`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### minimum\n",
    "* Missing hours, minutes, and seconds are filled with zeros.\n",
    "* Missing years, months, and days are filled with their minimum values:\n",
    "    * Min value of year: 2000\n",
    "    * Min value of month: 1\n",
    "    * Min value of day: 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t34 values cleaned (89.47%)\n",
      "\t2 values unable to be parsed (5.26%), set to NaN\n",
      "Result contains 34 (89.47%) values in the correct format and 4 null values (10.53%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10 15:08:56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "      <td>2000-01-01 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "      <td>2000-01-01 10:36:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "      <td>2000-01-01 10:36:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "      <td>2003-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "      <td>2000-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "      <td>2003-01-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "      <td>2003-10-09 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "      <td>2003-10-09 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "      <td>2000-01-01 22:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "      <td>2000-01-01 12:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "      <td>2003-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "      <td>2003-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "      <td>2096-07-10 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10 15:08:56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "      <td>1952-04-12 15:30:42</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "      <td>1994-11-05 08:15:30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>2001-05-03 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>1990-06-13 05:50:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date           date_clean\n",
       "0              1996.07.10 AD at 15:08:56 PDT  1996-07-10 15:08:56\n",
       "1                   Thu Sep 25 10:36:28 2003  2003-09-25 10:36:28\n",
       "2              Thu Sep 25 10:36:28 BRST 2003  2003-09-25 10:36:28\n",
       "3              2003 10:36:28 BRST 25 Sep Thu  2003-09-25 10:36:28\n",
       "4                   Thu Sep 25 10:36:28 2003  2003-09-25 10:36:28\n",
       "5                               Thu 10:36:28  2000-01-01 10:36:28\n",
       "6                                  Thu 10:36  2000-01-01 10:36:00\n",
       "7                                      10:36  2000-01-01 10:36:00\n",
       "8                            Thu Sep 25 2003  2003-09-25 00:00:00\n",
       "9                                Sep 25 2003  2003-09-25 00:00:00\n",
       "10                                  Sep 2003  2003-09-01 00:00:00\n",
       "11                                       Sep  2000-09-01 00:00:00\n",
       "12                                      2003  2003-01-01 00:00:00\n",
       "13                                2003-09-25  2003-09-25 00:00:00\n",
       "14                               2003-Sep-25  2003-09-25 00:00:00\n",
       "15                               25-Sep-2003  2003-09-25 00:00:00\n",
       "16                               Sep-25-2003  2003-09-25 00:00:00\n",
       "17                                09-25-2003  2003-09-25 00:00:00\n",
       "18                                10-09-2003  2003-10-09 00:00:00\n",
       "19                                  10-09-03  2003-10-09 00:00:00\n",
       "20                               2003.Sep.25  2003-09-25 00:00:00\n",
       "21                                2003/09/25  2003-09-25 00:00:00\n",
       "22                               2003 Sep 25  2003-09-25 00:00:00\n",
       "23                                2003 09 25  2003-09-25 00:00:00\n",
       "24                                      10pm  2000-01-01 22:00:00\n",
       "25                                   12:00am  2000-01-01 12:00:00\n",
       "26                                    Sep 03  2003-09-01 00:00:00\n",
       "27                                 Sep of 03  2003-09-01 00:00:00\n",
       "28                          Wed, July 10, 96  2096-07-10 00:00:00\n",
       "29             1996.07.10 AD at 15:08:56 PDT  1996-07-10 15:08:56\n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST  1952-04-12 15:30:42\n",
       "31          November 5, 1994, 8:15:30 am EST  1994-11-05 08:15:30\n",
       "32                           3rd of May 2001  2001-05-03 00:00:00\n",
       "33                  5:50 AM on June 13, 1990  1990-06-13 05:50:00\n",
       "34                                      NULL                  NaN\n",
       "35                                       nan                  NaN\n",
       "36                          I'm a little cat                  NaN\n",
       "37                              This is Sep.                  NaN"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clean_date(df, 'date', fix_missing='minimum')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### empty\n",
    "Missing components are left unfilled and appear as `-` placeholders in the output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t34 values cleaned (89.47%)\n",
      "\t2 values unable to be parsed (5.26%), set to NaN\n",
      "Result contains 34 (89.47%) values in the correct format and 4 null values (10.53%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10 15:08:56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "      <td>---------- 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "      <td>---------- 10:36:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "      <td>---------- 10:36:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "      <td>2003-09-25 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "      <td>2003-09-25 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "      <td>2003-09--- --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "      <td>-----09--- --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "      <td>2003------ --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "      <td>2003-09-25 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>2003-09-25 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "      <td>2003-09-25 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "      <td>2003-09-25 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "      <td>2003-09-25 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "      <td>2003-10-09 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "      <td>2003-10-09 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "      <td>2003-09-25 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "      <td>2003-09-25 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "      <td>2003-09-25 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "      <td>2003-09-25 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "      <td>---------- 22:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "      <td>---------- 12:00:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "      <td>2003-09--- --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "      <td>2003-09--- --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "      <td>2096-07-10 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10 15:08:56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "      <td>1952-04-12 15:30:42</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "      <td>1994-11-05 08:15:30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>2001-05-03 --:--:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>1990-06-13 05:50:--</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date           date_clean\n",
       "0              1996.07.10 AD at 15:08:56 PDT  1996-07-10 15:08:56\n",
       "1                   Thu Sep 25 10:36:28 2003  2003-09-25 10:36:28\n",
       "2              Thu Sep 25 10:36:28 BRST 2003  2003-09-25 10:36:28\n",
       "3              2003 10:36:28 BRST 25 Sep Thu  2003-09-25 10:36:28\n",
       "4                   Thu Sep 25 10:36:28 2003  2003-09-25 10:36:28\n",
       "5                               Thu 10:36:28  ---------- 10:36:28\n",
       "6                                  Thu 10:36  ---------- 10:36:--\n",
       "7                                      10:36  ---------- 10:36:--\n",
       "8                            Thu Sep 25 2003  2003-09-25 --:--:--\n",
       "9                                Sep 25 2003  2003-09-25 --:--:--\n",
       "10                                  Sep 2003  2003-09--- --:--:--\n",
       "11                                       Sep  -----09--- --:--:--\n",
       "12                                      2003  2003------ --:--:--\n",
       "13                                2003-09-25  2003-09-25 --:--:--\n",
       "14                               2003-Sep-25  2003-09-25 --:--:--\n",
       "15                               25-Sep-2003  2003-09-25 --:--:--\n",
       "16                               Sep-25-2003  2003-09-25 --:--:--\n",
       "17                                09-25-2003  2003-09-25 --:--:--\n",
       "18                                10-09-2003  2003-10-09 --:--:--\n",
       "19                                  10-09-03  2003-10-09 --:--:--\n",
       "20                               2003.Sep.25  2003-09-25 --:--:--\n",
       "21                                2003/09/25  2003-09-25 --:--:--\n",
       "22                               2003 Sep 25  2003-09-25 --:--:--\n",
       "23                                2003 09 25  2003-09-25 --:--:--\n",
       "24                                      10pm  ---------- 22:--:--\n",
       "25                                   12:00am  ---------- 12:00:--\n",
       "26                                    Sep 03  2003-09--- --:--:--\n",
       "27                                 Sep of 03  2003-09--- --:--:--\n",
       "28                          Wed, July 10, 96  2096-07-10 --:--:--\n",
       "29             1996.07.10 AD at 15:08:56 PDT  1996-07-10 15:08:56\n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST  1952-04-12 15:30:42\n",
       "31          November 5, 1994, 8:15:30 am EST  1994-11-05 08:15:30\n",
       "32                           3rd of May 2001  2001-05-03 --:--:--\n",
       "33                  5:50 AM on June 13, 1990  1990-06-13 05:50:--\n",
       "34                                      NULL                  NaN\n",
       "35                                       nan                  NaN\n",
       "36                          I'm a little cat                  NaN\n",
       "37                              This is Sep.                  NaN"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clean_date(df, 'date', fix_missing='empty')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### current\n",
    "* Missing hours, minutes and seconds are filled with the corresponding components of the current time\n",
    "* Missing years, months and days are filled with the corresponding components of the current date"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t34 values cleaned (89.47%)\n",
      "\t2 values unable to be parsed (5.26%), set to NaN\n",
      "Result contains 34 (89.47%) values in the correct format and 4 null values (10.53%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10 15:08:56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "      <td>2021-05-13 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "      <td>2021-05-13 10:36:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "      <td>2021-05-13 10:36:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "      <td>2003-09-25 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "      <td>2003-09-25 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "      <td>2003-09-13 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "      <td>2021-09-13 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "      <td>2003-05-13 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "      <td>2003-09-25 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>2003-09-25 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "      <td>2003-09-25 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "      <td>2003-09-25 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "      <td>2003-09-25 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "      <td>2003-10-09 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "      <td>2003-10-09 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "      <td>2003-09-25 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "      <td>2003-09-25 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "      <td>2003-09-25 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "      <td>2003-09-25 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "      <td>2021-05-13 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "      <td>2021-05-13 12:00:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "      <td>2003-09-13 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "      <td>2003-09-13 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "      <td>2096-07-10 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10 15:08:56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "      <td>1952-04-12 15:30:42</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "      <td>1994-11-05 08:15:30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>2001-05-03 22:05:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>1990-06-13 05:50:44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date           date_clean\n",
       "0              1996.07.10 AD at 15:08:56 PDT  1996-07-10 15:08:56\n",
       "1                   Thu Sep 25 10:36:28 2003  2003-09-25 10:36:28\n",
       "2              Thu Sep 25 10:36:28 BRST 2003  2003-09-25 10:36:28\n",
       "3              2003 10:36:28 BRST 25 Sep Thu  2003-09-25 10:36:28\n",
       "4                   Thu Sep 25 10:36:28 2003  2003-09-25 10:36:28\n",
       "5                               Thu 10:36:28  2021-05-13 10:36:28\n",
       "6                                  Thu 10:36  2021-05-13 10:36:44\n",
       "7                                      10:36  2021-05-13 10:36:44\n",
       "8                            Thu Sep 25 2003  2003-09-25 22:05:44\n",
       "9                                Sep 25 2003  2003-09-25 22:05:44\n",
       "10                                  Sep 2003  2003-09-13 22:05:44\n",
       "11                                       Sep  2021-09-13 22:05:44\n",
       "12                                      2003  2003-05-13 22:05:44\n",
       "13                                2003-09-25  2003-09-25 22:05:44\n",
       "14                               2003-Sep-25  2003-09-25 22:05:44\n",
       "15                               25-Sep-2003  2003-09-25 22:05:44\n",
       "16                               Sep-25-2003  2003-09-25 22:05:44\n",
       "17                                09-25-2003  2003-09-25 22:05:44\n",
       "18                                10-09-2003  2003-10-09 22:05:44\n",
       "19                                  10-09-03  2003-10-09 22:05:44\n",
       "20                               2003.Sep.25  2003-09-25 22:05:44\n",
       "21                                2003/09/25  2003-09-25 22:05:44\n",
       "22                               2003 Sep 25  2003-09-25 22:05:44\n",
       "23                                2003 09 25  2003-09-25 22:05:44\n",
       "24                                      10pm  2021-05-13 22:05:44\n",
       "25                                   12:00am  2021-05-13 12:00:44\n",
       "26                                    Sep 03  2003-09-13 22:05:44\n",
       "27                                 Sep of 03  2003-09-13 22:05:44\n",
       "28                          Wed, July 10, 96  2096-07-10 22:05:44\n",
       "29             1996.07.10 AD at 15:08:56 PDT  1996-07-10 15:08:56\n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST  1952-04-12 15:30:42\n",
       "31          November 5, 1994, 8:15:30 am EST  1994-11-05 08:15:30\n",
       "32                           3rd of May 2001  2001-05-03 22:05:44\n",
       "33                  5:50 AM on June 13, 1990  1990-06-13 05:50:44\n",
       "34                                      NULL                  NaN\n",
       "35                                       nan                  NaN\n",
       "36                          I'm a little cat                  NaN\n",
       "37                              This is Sep.                  NaN"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clean_date(df, 'date', fix_missing='current')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. `infer_day_first` parameter\n",
    "If `infer_day_first = True`, the `clean_date` function automatically infers whether the day or the month comes first in a column of ambiguous date strings.\n",
    "\n",
    "If `infer_day_first = False`, no inference is performed and each value is parsed on its own.\n",
    "\n",
    "By default, `infer_day_first = True`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t6 values cleaned (100.0%)\n",
      "Result contains 6 (100.0%) values in the correct format and 0 null values (0.0%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>12-01-06</td>\n",
       "      <td>2006-01-12 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>12-04-06</td>\n",
       "      <td>2006-04-12 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>12-05-06</td>\n",
       "      <td>2006-05-12 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>20-12-06</td>\n",
       "      <td>2006-12-20 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>21-12-06</td>\n",
       "      <td>2006-12-21 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>29-12-06</td>\n",
       "      <td>2006-12-29 00:00:00</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       date           date_clean\n",
       "0  12-01-06  2006-01-12 00:00:00\n",
       "1  12-04-06  2006-04-12 00:00:00\n",
       "2  12-05-06  2006-05-12 00:00:00\n",
       "3  20-12-06  2006-12-20 00:00:00\n",
       "4  21-12-06  2006-12-21 00:00:00\n",
       "5  29-12-06  2006-12-29 00:00:00"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "from dataprep.clean import clean_date\n",
    "\n",
    "# Some first fields exceed 12 (20, 21, 29), so they can only be days\n",
    "df = pd.DataFrame(\n",
    "    {'date': ['12-01-06', '12-04-06', '12-05-06', \n",
    "              '20-12-06', '21-12-06', '29-12-06']})\n",
    "\n",
    "clean_date(df, 'date')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t6 values cleaned (100.0%)\n",
      "Result contains 6 (100.0%) values in the correct format and 0 null values (0.0%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>12-01-06</td>\n",
       "      <td>2006-12-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>12-04-06</td>\n",
       "      <td>2006-12-04 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>12-05-06</td>\n",
       "      <td>2006-12-05 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>12-20-06</td>\n",
       "      <td>2006-12-20 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>12-21-06</td>\n",
       "      <td>2006-12-21 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>12-29-06</td>\n",
       "      <td>2006-12-29 00:00:00</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       date           date_clean\n",
       "0  12-01-06  2006-12-01 00:00:00\n",
       "1  12-04-06  2006-12-04 00:00:00\n",
       "2  12-05-06  2006-12-05 00:00:00\n",
       "3  12-20-06  2006-12-20 00:00:00\n",
       "4  12-21-06  2006-12-21 00:00:00\n",
       "5  12-29-06  2006-12-29 00:00:00"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Some second fields exceed 12 (20, 21, 29), so they can only be days\n",
    "df = pd.DataFrame(\n",
    "    {'date': ['12-01-06', '12-04-06', '12-05-06', \n",
    "              '12-20-06', '12-21-06', '12-29-06']})\n",
    "\n",
    "clean_date(df, 'date')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t6 values cleaned (100.0%)\n",
      "Result contains 6 (100.0%) values in the correct format and 0 null values (0.0%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>12-01-06</td>\n",
       "      <td>2006-12-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>12-04-06</td>\n",
       "      <td>2006-12-04 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>12-05-06</td>\n",
       "      <td>2006-12-05 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>20-12-06</td>\n",
       "      <td>2006-12-20 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>21-12-06</td>\n",
       "      <td>2006-12-21 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>29-12-06</td>\n",
       "      <td>2006-12-29 00:00:00</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       date           date_clean\n",
       "0  12-01-06  2006-12-01 00:00:00\n",
       "1  12-04-06  2006-12-04 00:00:00\n",
       "2  12-05-06  2006-12-05 00:00:00\n",
       "3  20-12-06  2006-12-20 00:00:00\n",
       "4  21-12-06  2006-12-21 00:00:00\n",
       "5  29-12-06  2006-12-29 00:00:00"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
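    "# Re-create the day-first column from the first example so the output below is reproducible\n",
    "df = pd.DataFrame({'date': ['12-01-06', '12-04-06', '12-05-06', '20-12-06', '21-12-06', '29-12-06']})\n",
    "clean_date(df, 'date', infer_day_first=False)"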
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. `report` parameter\n",
    "If `report = True`, a report is generated that shows:\n",
    "\n",
    "* How many values were cleaned\n",
    "* How many values could not be parsed (due to their invalid format)\n",
    "* How many values are in the correct format\n",
    "* How many null values there are\n",
    "\n",
    "If `report = False`, no report is generated."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dates Cleaning Report:\n",
      "\t34 values cleaned (89.47%)\n",
      "\t2 values unable to be parsed (5.26%), set to NaN\n",
      "Result contains 34 (89.47%) values in the correct format and 4 null values (10.53%)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10 15:08:56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "      <td>2000-01-01 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "      <td>2000-01-01 10:36:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "      <td>2000-01-01 10:36:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "      <td>2003-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "      <td>2000-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "      <td>2003-01-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "      <td>2003-10-09 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "      <td>2003-10-09 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "      <td>2000-01-01 22:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "      <td>2000-01-01 12:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "      <td>2003-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "      <td>2003-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "      <td>2096-07-10 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10 15:08:56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "      <td>1952-04-12 15:30:42</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "      <td>1994-11-05 08:15:30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>2001-05-03 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>1990-06-13 05:50:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date           date_clean\n",
       "0              1996.07.10 AD at 15:08:56 PDT  1996-07-10 15:08:56\n",
       "1                   Thu Sep 25 10:36:28 2003  2003-09-25 10:36:28\n",
       "2              Thu Sep 25 10:36:28 BRST 2003  2003-09-25 10:36:28\n",
       "3              2003 10:36:28 BRST 25 Sep Thu  2003-09-25 10:36:28\n",
       "4                   Thu Sep 25 10:36:28 2003  2003-09-25 10:36:28\n",
       "5                               Thu 10:36:28  2000-01-01 10:36:28\n",
       "6                                  Thu 10:36  2000-01-01 10:36:00\n",
       "7                                      10:36  2000-01-01 10:36:00\n",
       "8                            Thu Sep 25 2003  2003-09-25 00:00:00\n",
       "9                                Sep 25 2003  2003-09-25 00:00:00\n",
       "10                                  Sep 2003  2003-09-01 00:00:00\n",
       "11                                       Sep  2000-09-01 00:00:00\n",
       "12                                      2003  2003-01-01 00:00:00\n",
       "13                                2003-09-25  2003-09-25 00:00:00\n",
       "14                               2003-Sep-25  2003-09-25 00:00:00\n",
       "15                               25-Sep-2003  2003-09-25 00:00:00\n",
       "16                               Sep-25-2003  2003-09-25 00:00:00\n",
       "17                                09-25-2003  2003-09-25 00:00:00\n",
       "18                                10-09-2003  2003-10-09 00:00:00\n",
       "19                                  10-09-03  2003-10-09 00:00:00\n",
       "20                               2003.Sep.25  2003-09-25 00:00:00\n",
       "21                                2003/09/25  2003-09-25 00:00:00\n",
       "22                               2003 Sep 25  2003-09-25 00:00:00\n",
       "23                                2003 09 25  2003-09-25 00:00:00\n",
       "24                                      10pm  2000-01-01 22:00:00\n",
       "25                                   12:00am  2000-01-01 12:00:00\n",
       "26                                    Sep 03  2003-09-01 00:00:00\n",
       "27                                 Sep of 03  2003-09-01 00:00:00\n",
       "28                          Wed, July 10, 96  2096-07-10 00:00:00\n",
       "29             1996.07.10 AD at 15:08:56 PDT  1996-07-10 15:08:56\n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST  1952-04-12 15:30:42\n",
       "31          November 5, 1994, 8:15:30 am EST  1994-11-05 08:15:30\n",
       "32                           3rd of May 2001  2001-05-03 00:00:00\n",
       "33                  5:50 AM on June 13, 1990  1990-06-13 05:50:00\n",
       "34                                      NULL                  NaN\n",
       "35                                       nan                  NaN\n",
       "36                          I'm a little cat                  NaN\n",
       "37                              This is Sep.                  NaN"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clean_date(df, 'date', report=True)"
   ]
  },
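  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The figures in the report above can be cross-checked by hand. The cell below is a minimal sketch that does not use `dataprep`: the counts are read off the printed report and the result table (38 rows), and `pct` is a hypothetical helper introduced only for this illustration. It also suggests where the 4 null values come from: the 2 strings that could not be parsed plus the 2 inputs that appear to be null already (`NULL` and `nan`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Counts read off the report and the table above (assumed, not computed by dataprep)\n",
    "total = 38          # rows in the example DataFrame\n",
    "cleaned = 34        # values cleaned\n",
    "unparsed = 2        # values that could not be parsed, set to NaN\n",
    "already_null = 2    # inputs that look null to begin with: 'NULL' and 'nan'\n",
    "\n",
    "\n",
    "def pct(count, total):\n",
    "    # Hypothetical helper: count as a percentage of total, rounded to two decimals\n",
    "    return round(100 * count / total, 2)\n",
    "\n",
    "\n",
    "nulls_in_result = unparsed + already_null\n",
    "print(f'cleaned:  {pct(cleaned, total)}%')\n",
    "print(f'unparsed: {pct(unparsed, total)}%')\n",
    "print(f'null:     {pct(nulls_in_result, total)}%')"
   ]
  },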
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>date_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10 15:08:56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Thu Sep 25 10:36:28 BRST 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Thu Sep 25 10:36:28 2003</td>\n",
       "      <td>2003-09-25 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thu 10:36:28</td>\n",
       "      <td>2000-01-01 10:36:28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Thu 10:36</td>\n",
       "      <td>2000-01-01 10:36:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>10:36</td>\n",
       "      <td>2000-01-01 10:36:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Thu Sep 25 2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Sep 25 2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Sep 2003</td>\n",
       "      <td>2003-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Sep</td>\n",
       "      <td>2000-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2003</td>\n",
       "      <td>2003-01-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2003-09-25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>25-Sep-2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Sep-25-2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>09-25-2003</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>10-09-2003</td>\n",
       "      <td>2003-10-09 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>10-09-03</td>\n",
       "      <td>2003-10-09 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2003.Sep.25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>2003/09/25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>2003 Sep 25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>2003 09 25</td>\n",
       "      <td>2003-09-25 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>10pm</td>\n",
       "      <td>2000-01-01 22:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>12:00am</td>\n",
       "      <td>2000-01-01 12:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Sep 03</td>\n",
       "      <td>2003-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Sep of 03</td>\n",
       "      <td>2003-09-01 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Wed, July 10, 96</td>\n",
       "      <td>2096-07-10 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>1996.07.10 AD at 15:08:56 PDT</td>\n",
       "      <td>1996-07-10 15:08:56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Tuesday, April 12, 1952 AD 3:30:42pm PST</td>\n",
       "      <td>1952-04-12 15:30:42</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>November 5, 1994, 8:15:30 am EST</td>\n",
       "      <td>1994-11-05 08:15:30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>2001-05-03 00:00:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>1990-06-13 05:50:00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>NULL</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>nan</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>I'm a little cat</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>This is Sep.</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                        date           date_clean\n",
       "0              1996.07.10 AD at 15:08:56 PDT  1996-07-10 15:08:56\n",
       "1                   Thu Sep 25 10:36:28 2003  2003-09-25 10:36:28\n",
       "2              Thu Sep 25 10:36:28 BRST 2003  2003-09-25 10:36:28\n",
       "3              2003 10:36:28 BRST 25 Sep Thu  2003-09-25 10:36:28\n",
       "4                   Thu Sep 25 10:36:28 2003  2003-09-25 10:36:28\n",
       "5                               Thu 10:36:28  2000-01-01 10:36:28\n",
       "6                                  Thu 10:36  2000-01-01 10:36:00\n",
       "7                                      10:36  2000-01-01 10:36:00\n",
       "8                            Thu Sep 25 2003  2003-09-25 00:00:00\n",
       "9                                Sep 25 2003  2003-09-25 00:00:00\n",
       "10                                  Sep 2003  2003-09-01 00:00:00\n",
       "11                                       Sep  2000-09-01 00:00:00\n",
       "12                                      2003  2003-01-01 00:00:00\n",
       "13                                2003-09-25  2003-09-25 00:00:00\n",
       "14                               2003-Sep-25  2003-09-25 00:00:00\n",
       "15                               25-Sep-2003  2003-09-25 00:00:00\n",
       "16                               Sep-25-2003  2003-09-25 00:00:00\n",
       "17                                09-25-2003  2003-09-25 00:00:00\n",
       "18                                10-09-2003  2003-10-09 00:00:00\n",
       "19                                  10-09-03  2003-10-09 00:00:00\n",
       "20                               2003.Sep.25  2003-09-25 00:00:00\n",
       "21                                2003/09/25  2003-09-25 00:00:00\n",
       "22                               2003 Sep 25  2003-09-25 00:00:00\n",
       "23                                2003 09 25  2003-09-25 00:00:00\n",
       "24                                      10pm  2000-01-01 22:00:00\n",
       "25                                   12:00am  2000-01-01 12:00:00\n",
       "26                                    Sep 03  2003-09-01 00:00:00\n",
       "27                                 Sep of 03  2003-09-01 00:00:00\n",
       "28                          Wed, July 10, 96  2096-07-10 00:00:00\n",
       "29             1996.07.10 AD at 15:08:56 PDT  1996-07-10 15:08:56\n",
       "30  Tuesday, April 12, 1952 AD 3:30:42pm PST  1952-04-12 15:30:42\n",
       "31          November 5, 1994, 8:15:30 am EST  1994-11-05 08:15:30\n",
       "32                           3rd of May 2001  2001-05-03 00:00:00\n",
       "33                  5:50 AM on June 13, 1990  1990-06-13 05:50:00\n",
       "34                                      NULL                  NaN\n",
       "35                                       nan                  NaN\n",
       "36                          I'm a little cat                  NaN\n",
       "37                              This is Sep.                  NaN"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clean_date(df, 'date', report=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 7. `validate_date()`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`validate_date()` returns `True` when the input has a valid date format. Otherwise the function returns `False`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "False\n",
      "True\n",
      "False\n"
     ]
    }
   ],
   "source": [
    "from dataprep.clean import validate_date\n",
    "print(validate_date(\"Novvvvvvvvember 5, 1994, 8:15:30 am EST hahaha\"))\n",
    "print(validate_date(\"1994, 8:15:30\"))\n",
    "print(validate_date(\"Hello.\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>messy_date</th>\n",
       "      <th>valid</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>T, Ap 12, 1952 AD 3:30:42p</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>5:50 AM on June 13, 1990</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3rd of May 2001</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>55/23/2014</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>10pm</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>10p</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>2003-Sep-25</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Sepppppp</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>23 4 1962</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>2003 10:36:28 BRST 25 Sep Thu</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>hello</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>NaN</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>NULL</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                       messy_date  valid\n",
       "0      T, Ap 12, 1952 AD 3:30:42p  False\n",
       "1        5:50 AM on June 13, 1990   True\n",
       "2                 3rd of May 2001   True\n",
       "3                      55/23/2014   True\n",
       "4                            10pm   True\n",
       "5                             10p   True\n",
       "6                     2003-Sep-25   True\n",
       "7                        Sepppppp  False\n",
       "8                       23 4 1962   True\n",
       "9   2003 10:36:28 BRST 25 Sep Thu   True\n",
       "10                          hello  False\n",
       "11                            NaN  False\n",
       "12                           NULL  False"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.DataFrame({\"messy_date\":\n",
    "                   [\"T, Ap 12, 1952 AD 3:30:42p\", \"5:50 AM on June 13, 1990\", \"3rd of May 2001\", \"55/23/2014\",\n",
    "                    \"10pm\", \"10p\", \"2003-Sep-25\", \n",
    "                    \"Sepppppp\", \"23 4 1962\", \"2003 10:36:28 BRST 25 Sep Thu\", \n",
    "                    \"hello\", np.nan, \"NULL\"]\n",
    "                  })\n",
    "df[\"valid\"] = validate_date(df[\"messy_date\"])\n",
    "df"
   ]
  },
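  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the table above, `55/23/2014` is marked as valid because its format is acceptable, even though its values are out of range. When such values must be rejected as well and the expected layout is known, one option is to fall back on the standard library. The cell below is a sketch under stated assumptions, not `dataprep` behavior: `has_valid_values` is a hypothetical helper, and `%m/%d/%Y` is an assumed layout chosen for this example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from datetime import datetime\n",
    "\n",
    "\n",
    "def has_valid_values(text, fmt='%m/%d/%Y'):\n",
    "    # Hypothetical helper: True only if text matches fmt and every field is in range.\n",
    "    # strptime raises ValueError for a mismatch or an out-of-range field,\n",
    "    # and TypeError for non-string input such as NaN or None.\n",
    "    try:\n",
    "        datetime.strptime(text, fmt)\n",
    "    except (ValueError, TypeError):\n",
    "        return False\n",
    "    return True\n",
    "\n",
    "\n",
    "print(has_valid_values('55/23/2014'))  # False: 55 is not a month\n",
    "print(has_valid_values('09/25/2003'))  # True"
   ]
  },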
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
